Continuously Maintaining Order Statistics over Data Streams ( Extended

Author

  • Xuemin Lin
Abstract

A rank query finds the data element with a given rank under a monotonic order specified on the data elements. It has several equivalent variations [8, 17, 30]. Rank queries over data streams have been investigated in the form of quantile computation. A φ-quantile (φ ∈ (0, 1]) of a collection of N data elements is the element with rank ⌈φN⌉ under a monotonic order specified on the data elements. Rank and quantile queries have many applications [1, 3, 6, 7, 10, 14–16, 26, 27], including monitoring high-speed networks, detecting trends and fleeting opportunities in the stock market, sensor data analysis, Web ranking aggregation, and log mining. In these applications, they not only play important roles in decision making but are also used to summarize the data distributions of data streams. The following example shows a popular tool for comparing the distributions of two data sets (data streams).
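To make the rank-based definition of a φ-quantile concrete, here is a minimal Python sketch that answers rank and quantile queries exactly by sorting a finite collection. It is an offline illustration of the definition only, not the streaming technique of the paper; the function names `rank_query` and `quantile` are hypothetical, and ascending numeric order is assumed as the monotonic order.

```python
import math


def rank_query(elements, r):
    """Return the element with 1-based rank r in ascending order.

    Assumption: ascending order stands in for the monotonic order
    mentioned in the abstract.
    """
    if not 1 <= r <= len(elements):
        raise ValueError("rank must be between 1 and len(elements)")
    return sorted(elements)[r - 1]


def quantile(elements, phi):
    """Return the phi-quantile: the element with rank ceil(phi * N)."""
    if not 0 < phi <= 1:
        raise ValueError("phi must lie in (0, 1]")
    n = len(elements)
    # phi * n is computed in floating point when phi is a float, so a
    # product that should be an exact integer can round slightly upward;
    # passing phi as fractions.Fraction avoids this.
    r = math.ceil(phi * n)
    return rank_query(elements, r)


data = [7, 3, 9, 1, 5]
median = quantile(data, 0.5)    # rank ceil(2.5) = 3
maximum = quantile(data, 1.0)   # rank 5
```

Sorting costs O(N log N) time and O(N) space per query, so it only serves to pin down what an exact answer is for a collection that fits in memory.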


Similar resources

Sketch-based Querying of Distributed Sliding-Window Data Streams

While traditional data-management systems focus on evaluating single, ad hoc queries over static data sets in a centralized setting, several emerging applications require (possibly continuous) answers to queries on dynamic data that is widely distributed and constantly updated. Furthermore, such query answers often need to discount data that is “stale”, and operate solely on a sliding window of...


A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window

We consider the problem of maintaining aggregates over recent elements of a massive data stream. Motivated by applications involving network data, we consider asynchronous data streams, where the observed order of data may be different from the order in which the data was generated. The set of recent elements is modeled as a sliding timestamp window of the stream, whose elements are changing co...


Aggregate Computation over Data Streams

High-speed data streams are now a widely recognized phenomenon. Many applications require computing various statistics over data streams, including the processing of relational-type queries, data mining, and high-speed network management. In this paper, we provide a survey of three important kinds of aggregate computations over data streams: frequency moment, f...


Continuous Probabilistic Skyline Queries over Uncertain Data Streams

Recently, some approaches of finding probabilistic skylines on uncertain data have been proposed. In these approaches, a data object is composed of instances, each associated with a probability. The probabilistic skyline is then defined as a set of non-dominated objects with probabilities exceeding or equaling a given threshold. In many applications, data are generated as a form of continuous d...


Optimization and Security of Continuous Anonymizing Data Stream

A data stream is characterized by its huge size and continually changing data, and it must be responded to quickly because the number of times it can be queried is limited. The continuous query model and the approximate query model for data streams are introduced in this paper. Then, query optimization for data streams and for traditional databases is compared. Methods such as k-anonymity are designed for static dat...




Publication date: 2006